Reduced complexity IDCT decoding with graceful degradation

ABSTRACT

Data compressed according to a lossy DCT-based algorithm, such as the MPEG or MPEG2 algorithms, is decompressed according to a dynamically-selected set of DCT coefficients, with unused coefficients masked out. A macroblock of the data exhibiting little motion is decompressed with a small subset of DCT coefficients, while a macroblock exhibiting more motion is decompressed using a larger subset of DCT coefficients, up to the full set of DCT coefficients. Average computational complexity is thus kept low, enabling the use of inexpensive equipment, while degradation is minimized.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to the decoding of compression algorithms for digital data, and particularly to decoding algorithms employing the Inverse Discrete Cosine Transform.

2. Description of the Related Art

Digital datastreams are often compressed for purposes of storage and transmission. Datastreams containing alphanumeric data are typically required to be absolutely unchanged after compression and decompression, but when working with audio or pictorial data it may be acceptable to use “lossy” compression in which some detail may be lost or altered but in which a human observer perceives the output as substantially similar to the original.

Many lossy compression algorithms have been devised, such as MP3 (Moving Picture Experts Group Layer-3 Audio) for sound recordings, JPEG (Joint Photographic Experts Group) for still pictures, and MPEG (Moving Picture Experts Group) and MPEG2 for video recordings. An embodiment of the invention to be described applies primarily to MPEG2 compression, but is applicable to other algorithms as well.

In MPEG2 compression, a video frame to be transmitted is divided into macroblocks (MBs) of 8×8 pixels. A discrete cosine transform (DCT) is run on each MB, yielding an 8×8 array of coefficients. The coefficients, quantized and perhaps further compressed by Huffman-tree encoding, are stored or transmitted for retrieval by a playback device.

The playback device performs an inverse discrete cosine transform (IDCT) on each 8×8 array of coefficients to reconstruct the equivalent to the 8×8 array of pixels from the original frame. To recover maximum detail and accuracy, all 64 of the coefficients should be processed. (Even if all 64 coefficients are used, there will still be some loss of detail because of the aforementioned quantizing.) For many applications, such as consumer entertainment, a user may be willing to sacrifice some picture quality in order to have a lower-cost playback device. In a prior-art solution, a usable or acceptable level of picture quality is attained using fewer than all 64 of the coefficients, thus permitting the use of a computational element of lesser capability. The number of coefficients used in the inverse DCT is predetermined according to a desired level of quality for a particular computational element. The picture quality can be quite good for homogeneous scenes with little camera movement and little subject movement, but degrades for highly variegated scenes or when there is rapid camera movement or rapid subject movement. Picture degradation may exceed the limits of “graceful” degradation, a term of art indicating that although degradation is permitted, it is managed so as to be as unobtrusive as possible. There is thus a need for an MPEG2 playback system with the ability to process fewer than all of the DCT coefficients while maintaining graceful degradation of picture quality.

SUMMARY OF THE INVENTION

To overcome limitations in the prior art described above, and to overcome other limitations that will be apparent upon reading and understanding the present specification, the present invention provides a system and method of dynamically assessing horizontal high-frequency components of a DCT block and decoding using a number of DCT coefficients dynamically selected according to the current level of high-frequency components.

According to one aspect of the invention, the DCT component representing the highest frequency of DCT components representing horizontal frequency is assessed, and a masking of DCT coefficients is selected accordingly.

Other objects and features of the present invention will become apparent from the following detailed description considered in conjunction with the accompanying drawings. It is to be understood, however, that the drawings are designed solely for purposes of illustration and not as a definition of the limits of the invention, for which reference should be made to the appended claims. It should be further understood that the drawings are not necessarily drawn to scale and that, unless otherwise indicated, they are merely intended to conceptually illustrate the structures and procedures described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings, wherein like reference numerals denote similar elements:

FIG. 1 depicts the organization of DCT coefficient positions in an 8×8 array according to one embodiment of the invention;

FIG. 1A schematically illustrates relative frequencies represented by the DCT coefficient positions given in FIG. 1;

FIGS. 2A through 2H show typical maskings that may be applied to decode a signal coded into DCT coefficients according to FIG. 1, and state the relative computational complexity for each;

FIG. 3 is a flow chart for an embodiment of the invention; and

FIG. 4 is a block diagram of an apparatus suitable for executing the flow of FIG. 3.

DETAILED DESCRIPTION OF THE PRESENTLY PREFERRED EMBODIMENTS

In a typical data compression scheme, such as MPEG or MPEG2 data compression of video streams, an 8×8 array of pixels (a macroblock or MB) is extracted from a video frame, and a Discrete Cosine Transform (DCT) is performed on the MB to yield a set of DCT coefficients, which typically are quantized to produce an 8×8 array of DCT coefficients.

The DCT algorithm, well known in the art, is given here for reference. Given data A(i), where i is an integer in the range 0 to N−1, the forward DCT (which would be used, e.g., by an encoder) is:

$$B(k) = \frac{1 - \left(1 - \frac{\sqrt{2}}{2}\right)\delta(k)}{2} \sum_{i=0}^{N-1} A(i) \cos\left(\frac{\pi k}{N} \cdot \frac{2i+1}{2}\right)$$

where δ is Kronecker's delta.

B(k) is defined for all values of the frequency-space variable k, but we only care about integer k in the range 0 to N−1. The inverse DCT (which would be used, e.g., by a decoder) is:

$$\mathrm{AA}(i) = \sum_{k=0}^{N-1} B(k) \, \frac{1 - \left(1 - \frac{\sqrt{2}}{2}\right)\delta(k)}{2} \cos\left(\frac{\pi k}{N} \cdot \frac{2i+1}{2}\right)$$
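The forward and inverse transforms given above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of any embodiment: the function names `scale`, `forward_dct`, and `inverse_dct` and the sample values are hypothetical, and the example assumes N = 8 (one row or column of an 8×8 block), for which the stated scale factor makes the inverse recover the input exactly.

```python
import math


def scale(k):
    # [1 - (1 - sqrt(2)/2) * delta(k)] / 2, where delta(k) is 1 only for k == 0.
    delta = 1.0 if k == 0 else 0.0
    return (1.0 - (1.0 - math.sqrt(2.0) / 2.0) * delta) / 2.0


def forward_dct(a):
    # B(k) = scale(k) * sum over i of A(i) * cos((pi*k/N) * (2i+1)/2)
    n = len(a)
    return [
        scale(k) * sum(a[i] * math.cos((math.pi * k / n) * (2 * i + 1) / 2)
                       for i in range(n))
        for k in range(n)
    ]


def inverse_dct(b):
    # AA(i) = sum over k of B(k) * scale(k) * cos((pi*k/N) * (2i+1)/2)
    n = len(b)
    return [
        sum(b[k] * scale(k) * math.cos((math.pi * k / n) * (2 * i + 1) / 2)
            for k in range(n))
        for i in range(n)
    ]


# Hypothetical sample row of eight pixel values (N = 8 is an assumption).
samples = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
coeffs = forward_dct(samples)
restored = inverse_dct(coeffs)
```

With eight samples, `restored` matches `samples` to within floating-point rounding; a real decoder would also have to account for quantization, which this sketch omits.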

FIG. 1 shows a typical layout of such an array in which 64 coefficient positions are denominated 00 through 63. In the DCT algorithm as applied to this array, N has a value of 64. Position 00 contains a DCT coefficient representing the lowest vertical frequency in the MB and lowest horizontal frequency in the MB. Coefficients representing higher horizontal frequencies occupy successive positions “down” the array as depicted in FIG. 1, while coefficients representing higher vertical frequencies occupy successive positions “across” the array as depicted. Thus, the coefficient in position 07 represents the highest vertical frequency regardless of horizontal frequency, the coefficient in position 56 represents the highest horizontal frequency regardless of vertical frequency, and the coefficient in position 63 represents both the highest horizontal frequency and the highest vertical frequency. FIG. 1A schematically illustrates the relative frequencies in the array positions.
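The position numbering described above can be illustrated with a small index conversion. The sketch assumes the positions of FIG. 1 are numbered row by row (an assumption consistent with positions 07, 56, and 63 falling at the corners as described, though FIG. 1 itself is not reproduced here); the function name `position_to_row_col` is hypothetical.

```python
def position_to_row_col(position, width=8):
    # Row-major numbering (assumed): the row index counts steps "down" the
    # array and the column index counts steps "across" it.
    return divmod(position, width)


top_right = position_to_row_col(7)      # last position across the first row
bottom_left = position_to_row_col(56)   # first position of the last row
bottom_right = position_to_row_col(63)  # last position of the last row
```

Under this assumption, the row index tracks what the description calls horizontal frequency and the column index tracks vertical frequency.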

In order to reproduce the original frame for playback, it is necessary to perform an inverse discrete cosine transform (IDCT) on the 8×8 array of cosine coefficients to recover an approximation of the 8×8 MB from the original frame. It is an approximation because compression algorithms such as MPEG are inherently “lossy” compression algorithms: some detail is inherently lost or altered. However, the loss of detail may be imperceptible to the viewer. Further, it may be possible to increase the loss of detail (in order to simplify, and thus reduce the cost of, playback equipment) while still producing an output video stream that is not objectionable to the viewer.

FIG. 2A, by virtue of being completely hatched, denotes that every position of the 8×8 array of DCT coefficients is used in the IDCT decoding. This is 100% of the computational complexity for reconstructing an MB. FIGS. 2B through 2H each show a typical subset of the coefficients 00-63 being used in the IDCT decoding. A hatched square denotes that the corresponding DCT coefficient from the corresponding position identified in FIG. 1 is used in the IDCT decoding. An unhatched square indicates that the corresponding DCT coefficient is set to zero, and is not used. With each of FIGS. 2B through 2H is a relative (i.e., percentage) indication of the resulting computational complexity. The degree to which image quality is degraded by using a subset of the DCT coefficients depends on the frequency complexity of the MBs. An MB that is a portion of a constant flat background, for example, would probably not show perceptible degradation even with the 38% complexity of FIG. 2H. On the other hand, an MB that is a portion of the checked shirt of a man sprinting across the scene from left to right while the camera is panning the scene right to left would appear quite badly degraded with the 38% complexity of FIG. 2H, and would be degraded less with each higher level of complexity.
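The masking operation itself (hatched positions kept, unhatched positions set to zero) might be sketched as follows. The mask shown is purely hypothetical and does not reproduce any of FIGS. 2A through 2H; the names `apply_mask` and `HYPOTHETICAL_MASK` are likewise illustrative. Note also that the number of coefficients kept is not the same quantity as the relative computational complexity stated with each figure.

```python
def apply_mask(coeffs, mask):
    # Keep a coefficient where the mask is True ("hatched"); zero it otherwise.
    return [
        [value if keep else 0 for value, keep in zip(coeff_row, mask_row)]
        for coeff_row, mask_row in zip(coeffs, mask)
    ]


# Hypothetical mask: keep every row but only the first four columns.
HYPOTHETICAL_MASK = [[col < 4 for col in range(8)] for _ in range(8)]

# Stand-in coefficient array whose values equal their position numbers 00-63.
block = [[row * 8 + col for col in range(8)] for row in range(8)]

masked = apply_mask(block, HYPOTHETICAL_MASK)
kept_count = sum(1 for mask_row in HYPOTHETICAL_MASK for keep in mask_row if keep)
```

Because the masked-out coefficients are zero, an IDCT implementation can skip the corresponding multiply-accumulate work, which is where the complexity reduction would come from.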

Similar considerations apply to an MB exhibiting high complexity in the vertical orientation, such as the checked shirt of the man should he plummet off a cliff. In typical video program material, horizontal complexity is encountered far more often than vertical. The preferred embodiment of the present invention reduces degradation of horizontal complexity more than vertical, but it is understood that the techniques of the present invention may also be directed toward stressing vertical complexity or to treating horizontal and vertical complexity equally.

A prior-art solution to providing a nominal level of viewing quality on a low-cost playback device that cannot continuously provide 100% computation capability is to always decode using one predetermined subset of DCT coefficients, selected according to the computational capabilities of the playback device. For example, for a playback device based on a 100 MHz Intel Pentium chip, the 55% complexity of FIG. 2G might always be used, but for a playback device based on a 350 MHz Intel Pentium-II chip the 86% complexity of FIG. 2C might always be used. The latter device would produce better results, but even at that it might produce results with noticeable and obtrusive degradation for MBs with a high degree of horizontal complexity.

The present invention assesses the horizontal complexity of each individual MB, and selects the complexity level accordingly. Thus, in the example of the man with a checked shirt sprinting through the scene, high-complexity decoding is used for MBs from the checked shirt or other portions of the rapidly moving man so as to reduce degradation. But other MBs from the frame typically exhibit much lower complexity (the background behind the sprinting man might be a uniform building wall or a uniform blue sky), and low-complexity decoding could be used for those MBs without introducing objectionable degradation.

Referring again to FIG. 1, the magnitude of coefficient 56 is indicative of the horizontal complexity of the current MB at the highest horizontal frequency, and thus coefficient 56 is used as a bellwether to select the complexity of processing to be applied for the current MB. This selection is of great importance if the video data is interleaved (which is the case for most TV signals). The case where the data is interleaved and DCT coded after interleaving is known as frame-type DCT (as opposed to field-type DCT, performed on uninterleaved MBs). In interleaved data, a top field may be very different from a bottom field, in which case coefficient 56 will have a very high value. The prior-art solution of blindly using a fixed decoding complexity tends to result in objectionable degradation in such cases. Viewers have reported becoming dizzy from viewing such output. High vertical complexity tends to occur much less often in typical program material. (Other embodiments might use coefficient 63 (or some other coefficient along the main diagonal of the array) if it were desired to minimize horizontal and vertical degradation equally, or coefficient 07 might be used if it were desired to minimize only vertical degradation.)

For MBs having low horizontal complexity (from a uniform background, for example) the magnitude of coefficient 56 is very low, and the low-complexity decoding of FIG. 2H could accordingly be used to decode the MB without introducing significant degradation. Higher levels of decoding complexity are used for higher values of coefficient 56, thus keeping degradation down to acceptable values. For MBs for which coefficient 56 exceeds a predetermined threshold value, the 100% complexity of FIG. 2A, in which all 64 DCT coefficients are used, could be employed. For virtually all typical frames, the average computational complexity is well below 100%, even if 100% complexity decoding is used for some of the MBs comprising the frame.
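The effect on average complexity can be seen with simple weighted-average arithmetic. The sketch below uses made-up block counts (10 MBs decoded at 100% and 90 MBs at 38%); only the two percentage levels come from the description, and the function name `average_complexity` is hypothetical.

```python
def average_complexity(counts_by_level):
    # counts_by_level maps a relative complexity (percent) to the number of
    # MBs in the frame that were decoded at that level.
    total_blocks = sum(counts_by_level.values())
    weighted_sum = sum(level * count for level, count in counts_by_level.items())
    return weighted_sum / total_blocks


# Hypothetical frame of 100 MBs: 10 decoded at 100%, 90 decoded at 38%.
frame_average = average_complexity({100: 10, 38: 90})
```

For these illustrative counts the average works out to (10 × 100 + 90 × 38) / 100 = 44.2%, well below 100% even though some MBs receive full-complexity decoding.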

In a present embodiment of the invention, only one threshold value for coefficient 56 is defined; for values below the threshold, the coefficient subset depicted in FIG. 2G, with 55% relative complexity, is employed; for values at or above the threshold, the coefficient subset depicted in FIG. 2C, with 86% relative complexity, is employed.
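The single-threshold selection just described could be sketched as below. The threshold value is not given in the description, so `THRESHOLD = 16` is a placeholder; comparing the magnitude (absolute value) of coefficient 56 is also an assumption, as is the flat 64-element list in position order. The function name `select_subset` is hypothetical.

```python
THRESHOLD = 16  # hypothetical value; the description does not state one


def select_subset(coeffs, threshold=THRESHOLD):
    # coeffs: 64 DCT coefficients indexed by position 00..63 (assumed layout).
    bellwether = abs(coeffs[56])  # magnitude of coefficient 56 (assumption)
    if bellwether >= threshold:
        return ("FIG. 2C", 86)  # at or above the threshold: 86% complexity
    return ("FIG. 2G", 55)      # below the threshold: 55% complexity


flat_block = [0] * 64
busy_block = [0] * 56 + [-20] + [0] * 7

flat_choice = select_subset(flat_block)
busy_choice = select_subset(busy_block)
```

Only one coefficient is inspected per MB, so the selection adds negligible cost compared with the IDCT itself.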

An embodiment of the invention is described in flowchart form in FIG. 3. For each MB of each frame, an 8×8 array of DCT coefficients is received (block 302), typically from a storage means or a transmission means. In block 304, the value of coefficient 56 is assessed. As discussed supra, coefficient 56 is associated with the highest frequency of horizontal motion, and the present embodiment seeks to minimize horizontal degradation while permitting vertical degradation since vertical degradation occurs much less frequently in typical program material.

In block 306, according to a predetermined association of the maskings for subsets of DCT coefficients (FIG. 2) with the value of coefficient 56, a predetermined one of the maskings is selected. In block 308 the selected subset of DCT coefficients is used in an inverse-DCT operation to recover an approximation of the original macroblock. With the dynamic selection of coefficient subsets according to the value of coefficient 56, lower complexity is used when there is not much horizontal motion, and higher complexity is used to minimize degradation for various greater amounts of horizontal motion. A present embodiment employs one of two subset selections: the 55% complexity subset of FIG. 2G for values of coefficient 56 below a predetermined threshold, and the 86% complexity subset of FIG. 2C for values at or above the predetermined threshold.

Block 310 dispatches back to block 302 so that each MB of a frame is processed. Block 312 dispatches back to block 302 to process each frame in a video stream.
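The flow of FIG. 3 as described above might be sketched end to end as follows. Everything not stated in the description is an assumption made for illustration: the two masks (which do not reproduce FIGS. 2C and 2G), the threshold value, the row-major coefficient layout, the use of a separable 2-D inverse transform with N = 8 per axis, the pairing of array rows with pixel rows, and all function and variable names.

```python
import math

N = 8  # assumed row/column length of one block

# BASIS[k][i] = scale(k) * cos((pi*k/N) * (2i+1)/2), with the scale factor
# [1 - (1 - sqrt(2)/2) * delta(k)] / 2 from the inverse DCT formula.
BASIS = [
    [(math.sqrt(2.0) / 2.0 if k == 0 else 1.0) / 2.0
     * math.cos((math.pi * k / N) * (2 * i + 1) / 2)
     for i in range(N)]
    for k in range(N)
]

# Hypothetical maskings standing in for the low- and high-complexity subsets.
LOW_MASK = [[col < 4 for col in range(N)] for _ in range(N)]
HIGH_MASK = [[col < 7 for col in range(N)] for _ in range(N)]
THRESHOLD = 16  # hypothetical


def select_mask(coeffs):
    # Blocks 304 and 306: assess coefficient 56 and pick a masking.
    return HIGH_MASK if abs(coeffs[56]) >= THRESHOLD else LOW_MASK


def decode_block(coeffs):
    # coeffs: 64 values in position order 00..63 (row-major, assumed).
    mask = select_mask(coeffs)
    kept = [[coeffs[r * N + c] if mask[r][c] else 0.0 for c in range(N)]
            for r in range(N)]
    # Block 308: separable 2-D inverse DCT over the kept coefficients.
    return [
        [sum(kept[u][v] * BASIS[u][y] * BASIS[v][x]
             for u in range(N) for v in range(N))
         for x in range(N)]
        for y in range(N)
    ]


def decode_stream(frames):
    # Blocks 310 and 312: repeat for every MB of every frame.
    return [[decode_block(mb) for mb in frame] for frame in frames]


# One frame holding one flat MB whose only nonzero coefficient is position 00.
decoded = decode_stream([[[800.0] + [0.0] * 63]])
```

This sketch multiplies by the zeroed coefficients anyway, so it shows the data flow only; a practical decoder would skip the masked-out terms to realize the complexity savings.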

Apparatus for carrying out the operations described herein may, as a matter of design choice, be constructed in special-purpose hardware, or in general-purpose digital logic hardware programmed by appropriate firmware or software. Such an apparatus 400 is block-diagrammed in FIG. 4. It contains a data receiver 402 for receiving input data; a data store 404 for storing computer instructions and data (input data, intermediate data, processed output data, and working data such as the predetermined DCT subsets); a computation means 406; control logic 408; and a data transmitter 410 for outputting data.

Thus, while there have been shown and described and pointed out fundamental novel features of the invention as applied to a preferred embodiment thereof, it will be understood that various omissions and substitutions and changes in the form and details of the devices illustrated, and in their operation, may be made by those skilled in the art without departing from the spirit of the invention. For example, it is expressly intended that all combinations of those elements and/or method steps which perform substantially the same function in substantially the same way to achieve the same results are within the scope of the invention. Moreover, it should be recognized that structures and/or elements and/or method steps shown and/or described in connection with any disclosed form or embodiment of the invention may be incorporated in any other disclosed or described or suggested form or embodiment as a general matter of design choice. It is the intention, therefore, to be limited only as indicated by the scope of the claims appended hereto.

1. A method of decoding DCT-encoded blocks of a data signal, the method comprising: (a) predetermining a plurality of subsets of DCT coefficient positions; (b) receiving a set of DCT coefficients obtained from DCT-encoding a corresponding portion of a data signal; (c) selecting one of said subsets of DCT coefficient positions according to a value of a predetermined one of the received DCT coefficients; (d) performing IDCT decoding on the selected subset of DCT coefficients to recover a representation of the corresponding portion of the data signal; and (e) repeating steps (b), (c), and (d) for successive sets of DCT coefficients.
2. (canceled)
3. (canceled)
4. (canceled)
5. (canceled)
6. (canceled)
7. (canceled)
8. Apparatus for decoding DCT-encoded blocks of a data signal, the apparatus comprising: a data store for storing a predetermined plurality of subsets of DCT coefficient positions; a receiver for receiving a set of DCT coefficients obtained from DCT-encoding a portion of said data signal; computation means for: selecting one of said subsets of DCT coefficient positions according to a value of a predetermined one of the received DCT coefficients; and performing IDCT decoding on the selected subset of DCT coefficients to recover a representation of the corresponding portion of the data signal; and control logic for routing successive sets of DCT coefficients through the receiver and computation means.
9. (canceled)
10. (canceled)
11. (canceled)
12. (canceled)
13. (canceled)
14. (canceled)